Empirical study of PROXTONE and PROXTONE$^+$ for Fast Learning of Large Scale Sparse Models

Authors

  • Ziqiang Shi
  • Rujie Liu
Abstract

PROXTONE is a novel and fast method for optimizing large-scale non-smooth convex problems [18]. In this work, we apply PROXTONE to large-scale non-smooth non-convex problems, for example training a sparse deep neural network (sparse DNN) or a sparse convolutional neural network (sparse CNN) for embedded or mobile devices. PROXTONE converges much faster than first-order methods, whereas first-order methods make it easier to derive and control the sparseness of the solutions. Thus, in applications where sparse models must be trained quickly, we propose to combine the merits of both: we use PROXTONE for the first several epochs to reach the neighborhood of an optimal solution, and then use a first-order method to explore the possibility of sparsity in the remaining training. We call this method PROXTONE plus (PROXTONE$^+$). Both PROXTONE and PROXTONE$^+$ are tested in our experiments, which demonstrate that both methods improve convergence speed by at least a factor of two on diverse sparse model learning problems while reducing the size of DNN models to 0.5%. The source code of all the algorithms is available upon request.
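
The two-phase schedule described in the abstract can be illustrated on a toy lasso problem. This is only a minimal sketch under stated assumptions: the abstract does not give the PROXTONE update, so phase 1 uses a curvature-scaled proximal step as a stand-in (it is not the method from [18]), and phase 2 uses plain proximal gradient descent (ISTA) as the first-order method. The names `soft_threshold`, `gradient`, `two_phase_fit`, and the epoch counts are hypothetical.

```python
def soft_threshold(v, t):
    """Proximal operator of t*|v|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def gradient(A, b, x):
    """Gradient of the smooth loss (1/2n) * ||Ax - b||^2."""
    n = len(A)
    residual = [sum(a * xj for a, xj in zip(row, x)) - bi
                for row, bi in zip(A, b)]
    return [sum(A[i][j] * residual[i] for i in range(n)) / n
            for j in range(len(x))]


def two_phase_fit(A, b, lam, curvature_epochs=5, first_order_epochs=50):
    """Hypothetical two-phase schedule for (1/2n)||Ax - b||^2 + lam*||x||_1."""
    n, d = len(A), len(A[0])
    # Hessian of the smooth loss, H = A^T A / n.
    H = [[sum(A[i][j] * A[i][k] for i in range(n)) / n for k in range(d)]
         for j in range(d)]
    # Diagonal bound D_jj = sum_k |H_jk|; D - H is positive semidefinite,
    # so each scaled step below cannot increase the objective.
    D = [max(sum(abs(h) for h in row), 1e-12) for row in H]
    x = [0.0] * d

    # Phase 1 (assumed stand-in for the fast second-order phase):
    # proximal steps scaled per coordinate by the curvature bound D.
    for _ in range(curvature_epochs):
        g = gradient(A, b, x)
        x = [soft_threshold(xj - gj / Dj, lam / Dj)
             for xj, gj, Dj in zip(x, g, D)]

    # Phase 2: plain proximal gradient (ISTA) with one global step size,
    # where soft-thresholding sets small weights exactly to zero.
    step = 1.0 / max(D)
    for _ in range(first_order_epochs):
        g = gradient(A, b, x)
        x = [soft_threshold(xj - step * gj, step * lam)
             for xj, gj in zip(x, g)]
    return x


# Toy data (made up for illustration): a scaled identity design.
A = [[2.0, 0.0, 0.0, 0.0],
     [0.0, 2.0, 0.0, 0.0],
     [0.0, 0.0, 2.0, 0.0],
     [0.0, 0.0, 0.0, 2.0]]
b = [4.0, 0.1, -3.0, 0.0]
weights = two_phase_fit(A, b, lam=0.5)
sparsity = sum(1 for w in weights if w == 0.0) / len(weights)
```

In this sketch the switch point is a fixed epoch count; a real training loop might instead switch when the objective stops improving quickly, which the abstract leaves unspecified.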

Similar resources

Large-scale Inversion of Magnetic Data Using Golub-Kahan Bidiagonalization with Truncated Generalized Cross Validation for Regularization Parameter Estimation

In this paper a fast method for large-scale sparse inversion of magnetic data is considered. The L1-norm stabilizer is used to generate models with sharp and distinct interfaces. To deal with the non-linearity introduced by the L1-norm, a model-space iteratively reweighted least squares algorithm is used. The original model matrix is factorized using the Golub-Kahan bidiagonalization that proje...

Sparse Structured Principal Component Analysis and Model Learning for Classification and Quality Detection of Rice Grains

In scientific and commercial fields associated with modern agriculture, the categorization of different rice types and the determination of their quality are very important. Various image processing algorithms have been applied in recent years to detect different agricultural products. The problem of rice classification and quality detection in this paper is presented based on model learning concepts includ...

A Modified Empirical Path Loss Model for 4G LTE Network in Lagos, Nigeria

The quality of the signal at a particular location is essential in determining the performance of a mobile system. The problem of poor network in Lagos, Nigeria needs to be addressed, especially now that attention is shifting toward online learning and meetings. Existing empirical path loss (PL) models designed elsewhere are not appropriate for predicting the 4G Long-Term Evolution (LTE) signal in Nigeria. T...

A Comparative Study of Multipole and Empirical Relations Methods for Effective Index and Dispersion Calculations of Silica-Based Photonic Crystal Fibers

In this paper, we present a solid-core silica-based photonic crystal fiber (PCF) composed of a hexagonal lattice of air holes and calculate the effective index and chromatic dispersion of the PCF for different physical parameters using the empirical relations method (ERM). These results are compared with the data obtained from the conventional multipole method (MPM). Our simulation results reveal tha...

Robust Estimation in Linear Regression with Molticollinearity and Sparse Models

One of the factors affecting the statistical analysis of the data is the presence of outliers. The methods which are not affected by the outliers are called robust methods. Robust regression methods are robust estimation methods of regression model parameters in the presence of outliers. Besides outliers, the linear dependency of regressor variables, which is called multicollinearity...

Journal:
  • CoRR

Volume abs/1604.05024

Publication date 2016